The National Household Education Survey (NHES) was
conducted for the first time in 1991 to collect data on the early 
childhood education (ECE) experiences of young children and 
participation in adult education. Because the NHES methodology is 
relatively new, field tests were necessary. A large field test of 
approximately 15,000 households was conducted during the fall of 1989 
to examine several methodological issues. This report analyzes data 
from the Current Population Survey to identify the extent of 
telephone undercoverage for 14- to 21-year-olds and 3- to 5-year-olds 
and bias related to undercoverage for estimates of school dropouts 
and ECE program participation. Methods for adjusting survey estimates 
to reduce this bias partially are developed and evaluated. 
Recommendations are given to improve sampling accuracy for both 
populations. For estimation of 14- to 21-year-olds in the NHES, it is
recommended that the mean adjusted poststratified estimator be used
because it incorporates an additional smoothing over the within-cell
adjusted estimator. Poststratification variables that are more
closely related to household income should be considered for the NHES
estimation phase, and the use of tenure in addition to or in place of
some of the other poststratification variables may be useful in this
respect. For estimation of 3- to 5-year-olds in the NHES, the
poststratified estimator appears to perform reasonably well for the
range of statistics available, and it is recommended for use with 
this target population. Problems concerning undercoverage bias due to 
households without telephones were not substantial. Fourteen tables 
and eight figures present field test findings. An appendix discusses 
the source and reliability of estimates. (SLD) 
Foreword

The National Household Education Survey (NHES) 
represents a major new initiative of the National Center 
for Education Statistics (NCES). Between February 
and May of 1991, the NHES was fielded for the first 
time as a mechanism for collecting data on two 
different sectors of education policy interest: the early 
childhood education experience of young children and 
participation in adult education. Because the NHES 
methodology is relatively new and relies on some 
innovative approaches, a field test of the methodology
was an essential first step in the development of the
survey. Many of the methods evaluated during the
1989 NHES field test were adopted for the full-scale
survey. 

A large field test of approximately 15,000 
households was conducted during the fall of 1989. A 
number of methodological issues associated with 
collecting and analyzing data on education issues from 
a random digit dialing telephone survey were examined. 
This report is one of five that describe the 1989 NHES 
Field Test experience. The five reports are the first in 
a series of technical publications pertaining to the
design and conduct of the NHES that NCES hopes to
continue in the years to come. NCES believes that the
reports contained in this series will provide users of the 
NHES data with a better understanding of the NHES 
methodology and that they will assist the survey design 
efforts of others. 

The first report in this series, Overview of the
National Household Education Survey Field Test,
describes the design of the field test and the outcomes
of the field test data collection activities. It reports on 
the response rates obtained, both unit and item, and the 
burden associated with survey participation. Each of 
the next four reports in the series focuses on a specific 
issue that was examined in the 1989 NHES field test. 



The second report, Telephone Undercoverage Bias of
14- to 21-Year-Olds and 3- to 5-Year-Olds, analyzes
data from the Current Population Survey to identify the
extent of telephone coverage for two distinct
populations of interest and the bias associated with this
type of undercoverage for estimates of school dropouts
and early childhood education program participation.
Methods for adjusting survey estimates to partially 
reduce this bias are developed and evaluated. 

The third report, Multiplicity Sampling for Dropouts
in the NHES Field Test, examines a technique that was
used to increase the coverage of 14- to 21-year-olds and
to capture more dropouts in the sample. The report
describes the effectiveness of the multiplicity sample in
achieving these goals. 

The fourth report, Proxy Reporting of Dropout Status
in the NHES Field Test, focuses on measurement errors
arising from the use of proxy respondents. During the
1989 Field Test, a knowledgeable household member
was used as a source of information on the school
enrollment of each sampled 14- to 21-year-old in the
household. In addition, 14- to 21-year-olds were asked
to report on their own school enrollment. The report
describes the correspondence between the responses
given by proxy respondents and those provided by the
youths themselves.

The fifth report, Effectiveness of Oversampling Blacks
and Hispanics in the NHES Field Test, describes the
approach used to increase the number of black and
Hispanic households/youth in the sample. During the
field test, an approach that uses demographic
information at the telephone exchange level to develop
sampling strata was used to oversample black and
Hispanic households. The report examines the yield of
the field test sample design versus that which would 
have been expected without oversampling. The effects 
of oversampling on the precision of survey estimates 
are reported. 
Introduction

During the fall of 1989, the Field Test of the 
National Household Education Survey (NHES) was 
conducted by the National Center for Education 
Statistics (NCES) to explore the feasibility of 
collecting education data by telephone from a sample 
of persons in their households. The NHES is the
first major attempt by NCES to go beyond its
traditional surveys, which rely upon school-based 
data collection systems and are typically conducted by 
mail or in-person data collection methods. 

A household survey has the potential to 
provide the types of data needed to study current 
issues in education, particularly those which cannot
be adequately addressed through a school-based 
survey. Such issues include dropping out of school, 
adult and continuing education, preschool education, 
the status of former teachers, and home-based 
education. Consequently, the NHES methodology 
may greatly enhance the scope of issues covered by 
the data collection activities of NCES. 

Since the NHES data collection methods 
were untested for education surveys, the Field Test 
was developed to evaluate the use of this approach. 
Two topics of broad policy interest were included in 
the Field Test: the early childhood education 
characteristics of 3- to 5-year-olds, and the 
educational status of 14- to 21-year-olds with a 
special focus on youth who dropped out of school 
before completing high school. By including both of
these study areas in the Field Test, the ability to use
the NHES to study multiple, complex topics,
employing different sampling requirements and
respondent rules, could be evaluated.

Westat, Inc., under contract with NCES, 
conducted all of the Field Test interviews using 
computer-assisted telephone interviewing (CATI)
methods. The use of CATI methods made sampling 
respondents for interviews easy and nearly invisible 
to the telephone respondent, an important benefit 
when several persons may be sampled in a 
household. CATI also directed the interviewers 
through complex skip patterns and provided the 
opportunity to incorporate edit checks to help resolve 
inconsistencies in the data while the respondents were 
still on the telephone. Another major advantage of 
the use of CATI was that data analysis could begin
soon after data collection ended, because data entry 
and many of the edit checks were done during the 
interview. 



The sampling scheme used in the Field Test
was a variant of the Mitofsky-Waksberg random digit
dial (RDD) procedure, in which every residential
telephone number has the same chance of being
drawn into the sample. Because of the need for more 
precise estimates of blacks and Hispanics, special 
sampling methods were used to increase the sample 
size for these persons. The design for the Field Test 
was essentially the same as planned for a full-scale 
NHES study, except the overall sample size was 
smaller. 

The sample resulted in collecting data from 
15,037 households representing all civilian, 
noninstitutionalized persons in the 50 states and the 
District of Columbia. Although only persons living 
in telephone households could be sampled for the 
Field Test, adjustments were made in the weights so 
that the estimates of persons living in both telephone 
and nontelephone households could be produced. 

Respondents in sampled households were 
asked a series of screening questions. This 
interview, called the Screener, was used to enumerate 
all the members of the household, determine the 
eligibility of each person in the household for the 
early childhood education (3- to 5-year-olds) and 
youth (14- to 21-year-olds) studies, and obtain some 
data on the characteristics of the household. A total 
of 4,374 households had at least one person 
enumerated in the Screener who was eligible for an 
extended interview. The response rate to the 
Screener was 79 percent. 

The early childhood education interview was 
conducted with the parent or guardian who knew the 
most about each sampled 3- to 5-year-old child's care 
and education. Accordingly, this interview was 
called the Parent Interview. Of the 1,551 children 
identified in the Screener, parents completed 
interviews for 1,530 children, a completion rate of 99
percent. 

If the household contained any 14- to 21-year-olds,
then a Household Respondent Interview
(HRI) was attempted for each of these members. 
The HRI was used to determine the current and 
previous educational status of the youth; this 
interview could be completed by any adult household 
member who knew about the educational activities of 
the youth, including self-reports by the youth. Of the 
4,441 youths identified in the Screener, HRIs were 
completed for 4,313 youths, for a 97 percent 
completion rate. As part of a special methodological 
study of multiplicity sampling, mothers in a 
subsample of the households were asked to complete
the HRI for their 14- to 21-year-old children who did
not live in their household. These youth are included 
in the numbers stated above. 

A Youth Interview (YI) was then attempted
for a subsample of the 14- to 21-year-olds in the
household. All the youths who were not currently
enrolled in school and did not have a high school
diploma or equivalent (as reported in the HRI), and
a sample of all other youths, were targeted for the
YI. The interview contained more detailed items on
the educational experiences of the youth that could
only be answered by the youth. Of the 1,863 youths
sampled, 1,604 completed the YI, a completion rate
of 86 percent. These numbers include a sample of
133 youths who did not live in the sampled 
households, but were included through the 
multiplicity sample when their mothers completed the 
HRI. 

This report describes research conducted 
prior to the Field Test data collection. The research 
involves the examination of the issues of telephone
coverage for the two populations sampled for the
Field Test. The Field Test is described in another
report entitled Overview of the National Household
Education Survey Field Test, the first in a series of
reports on the Field Test. The Overview Report
describes the sample design, the data collection 
methods and instruments, the response rates, and 
other salient aspects of the collection and analysis 
process for the Field Test. 

This research was conducted to understand 
important methodological issues that could not be 
directly addressed from data collected in the NHES 
Field Test. An important concern for any survey is 
the completeness of the survey in terms of covering 
the target population. Every household survey is 
subject to some undercoverage bias, the result of 
some members of the target population being either 
deliberately or inadvertently missed in the survey. 
The discussion of the undercoverage bias in the
decennial Census of Population is one well-known
example of this problem. A general discussion of the
problems of undercoverage, with references to the
literature, is given by Madow, Nisselson, and Olkin.

Telephone surveys like the NHES are subject 
to an additional source of bias because only about 93 
percent of all the households in the United States 
have a telephone. Even more problematic is the fact
that the percentage of households without telephones
varies from one subgroup of the population to
another. Massey and Botman discuss this problem
in some detail.

Since presence of a telephone in a household 
is correlated with variables such as income, 
education, and household size, it is very likely that 
estimates of dropping out of school and early
childhood educational experiences are affected by this
source of bias. Because of uncertainty about how this
variability affects statistics to be gathered in NHES,
a special analysis of the bias associated with 
telephone coverage and its potential impact on 
estimates from the NHES was conducted. 

This report examines the telephone 
undercoverage issue for the 14- to 21-year-old and
the 3- to 5-year-old populations separately. The 
research was completed prior to the NHES Field Test 
using existing data from surveys conducted by the 
Census Bureau. For each population, the estimates 
of the magnitude of the bias associated with the 
telephone undercoverage are examined first. Then 
methods for adjusting the estimates to partially reduce 
this bias are proposed and evaluated. 
Recommendations for estimation strategies are 
proposed for each of the populations. 

Source of Data for Analysis 

Each October a supplement (the Education
Supplement) to the monthly Current Population 
Survey (CPS) is conducted by the Census Bureau. 
Among the supplemental questions are items on the 
current and previous years' school enrollment and 
high school graduation status. These items are 
available on the October public-use file released by 
the Census Bureau. Data on which of the sampled 
households in the CPS have telephones are also 
collected, although this data item was not included in 
the October public release file prior to 1989. 

To construct the data base for the telephone 
undercoverage analysis, we merged the telephone 
status information from the November public-use file 
onto the Education Supplement data from the 1988 
October public-use file using a unique household 
identifier common to both files. A feature of the 
CPS sample design is that a portion of the household 
sample is rotated in or out of the survey from one 
month to the next. For any two successive months 
about 75 percent of households overlap by design and 
about 71 percent of households actually overlap after
accounting for nonresponse and persons who moved
in either of the two months. 
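
The merge step can be sketched as an inner join on the household identifier. This is only an illustration: the field names (hh_id, has_phone, age, weight) and the records are hypothetical and do not reflect the actual CPS public-use file layout.

```python
# Hypothetical October person records and November household records.
# Field names and values are assumptions made for illustration only.
october = [
    {"hh_id": "A1", "age": 16, "weight": 1500.0},
    {"hh_id": "A2", "age": 4, "weight": 1800.0},
    {"hh_id": "A3", "age": 19, "weight": 1700.0},  # rotated out before November
]
november = [
    {"hh_id": "A1", "has_phone": True},
    {"hh_id": "A2", "has_phone": False},
    {"hh_id": "A4", "has_phone": True},  # rotated in after October
]


def merge_phone_status(oct_persons, nov_households):
    """Attach November telephone status to October person records.

    Persons whose household does not appear in November are dropped,
    which mirrors the loss of sample caused by rotation and nonresponse.
    """
    phone_by_hh = {h["hh_id"]: h["has_phone"] for h in nov_households}
    return [
        {**p, "has_phone": phone_by_hh[p["hh_id"]]}
        for p in oct_persons
        if p["hh_id"] in phone_by_hh
    ]


merged = merge_phone_status(october, november)
overlap_rate = len(merged) / len(october)  # share of October records retained
```

Because unmatched records are dropped, the merged file is smaller than the October file, which is why the weights need to be inflated afterward.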

In the analysis which follows using the
merged October and November data, we have
compensated for the reduction in sample size by
inflating the weights used in estimation to weight
back up to the fully aggregated October estimate.
The factor used to inflate the weights to account for 
the reduction in sample size is the ratio of the 
estimated October total population (either of 14- to 
21-year-olds or of 3- to 5-year-olds) to the 
corresponding estimated total population remaining on 
the merged October and November file. The 
comparison of the estimated number of persons, by
age, from the merged file with the totals from the 
complete October file indicates that the use of the 
merged file does not significantly distort the 
estimates. 
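
The weight inflation described above amounts to a single ratio adjustment per age group. A minimal sketch follows, with hypothetical weights and a hypothetical October total; the function name is an assumption, not something taken from the CPS processing system.

```python
def inflate_weights(merged_weights, october_total):
    """Scale merged-file weights so they sum to the full October estimate.

    The factor is the ratio of the estimated October total population to
    the estimated total remaining on the merged October-November file.
    """
    merged_total = sum(merged_weights)
    factor = october_total / merged_total
    return factor, [w * factor for w in merged_weights]


# Hypothetical values: the merged file retains weights summing to 5,000,
# while the complete October file estimates 7,000 persons in the age group.
factor, inflated = inflate_weights([1500.0, 1800.0, 1700.0], october_total=7000.0)
```

The adjustment would be applied separately to the 14- to 21-year-old and the 3- to 5-year-old files, since each has its own October total.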

In a preliminary stage of the analysis, the 
1987 October and November public-use files were 
merged. Some analyses were conducted using the 
merged 1987 files, and the findings of these analyses 
suggested several approaches to reduce the size of the 
bias. To evaluate these approaches, the 1988 public-use
files were merged and used for the analysis
described in this report. The method used to define
some dropout characteristics for the 14- to 21-year-old
population in the 1988 merged file was slightly
revised for the 1987 analysis file. All of the results
reported are from the 1988 file, unless otherwise 
noted. 

Although there are many variables included
in the analysis, the telephone status of a household is 
obviously the most critical data element. The data 
element which indicated if a telephone was present in 
a household was missing for less than 0.5 percent of 
the records in the November public-use file. 
Tabulations were made to compare the estimated total 
populations to the sum of the estimated telephone and 
nontelephone household populations. The estimated 
total population was greater than the sum of the two 
estimated components by less than 0.5 percent. The 
difference was mainly due to missing data on the 
telephone status variable. This level of missing data 
does not have an important effect on the estimates of 
coverage for the populations of interest. 

The procedure used to analyze the extent and 
impact of telephone coverage in the two NHES Field 
Test topic areas is to compare the statistics from the 
CPS for all households to the same statistics based 
only on the sample from households with telephones. 
Different estimation schemes were devised to adjust
the estimates from the sample of telephone 
households to approximate the estimates from all 
households. 

The analysis described below focuses on
potential biases of telephone surveys arising from 
incomplete telephone coverage. It does not include 
other sources of sampling and nonsampling error. 
The CPS is itself a sample survey and subject to both 
of these types of errors. 

One important source of nonsampling error 
in the CPS is coverage, although not telephone 
coverage since the CPS is conducted in-person 
whenever no telephone is available in the household. 
(The first CPS interview is always conducted in-person;
subsequent interviews may be conducted by
telephone if the respondent is willing and a telephone
is available.) The CPS coverage problem is most
severe for males between 19 and 24 years old and for
blacks and Hispanics. These nonsampling errors
pose additional problems for estimating characteristics 
for these subsets of the population. 

Another type of nonsampling error that
arises in both telephone and in-person interviews is
the incomplete coverage of household members.
Research conducted by Maklan and Waksberg
indicates that within-household coverage is no worse 
for telephone surveys than it is for in-person surveys. 

Undercoverage Bias in Estimates of 
14- to 21-Year-Olds 

The problem associated with telephone 
coverage for 14- to 21-year-olds, especially those 
who left high school without a diploma, is probably 
more severe than it is for any other major population 
subgroup of interest to education policymakers. The 
analysis of the problem begins with the presentation 
of basic estimated totals, dropout rates, and telephone 
coverage rates. After displaying the nature of the 
problem, there are presentations of alternative 
estimators that could be used for the NHES, estimates 
of the magnitude of the bias associated with each 
estimator, and recommendations for implementation. 

First, a brief definition of terms is 
necessary. For this analysis, persons were 
categorized into groups by their enrollment status in 
the current year and the previous year. Four 
categories of enrollment status are used: all persons, 
those enrolled in school last year, status dropouts,
and event dropouts. The categorization is not 
exclusive; a person can fall into more than one 
category. First, the two types of dropouts are 
defined. This is followed by a definition of those 
categorized as being in school last year in the context 
of the definitions of dropouts. 

A status dropout is defined as a 14- to 21- 
year-old who was not enrolled in school in October 
of the current year and did not have a high school 
diploma or equivalent. Event dropouts are defined as 
the subset of status dropouts who were enrolled in 
school in October of the previous year. In other 
words, a status dropout is someone who is not 
currently enrolled and does not have a diploma or 
equivalent, and an event dropout is a status dropout 
who left school within the last year. 

Dropout rates can be computed for each of 
these two types of dropouts. The status dropout rate 
is defined as the ratio of the estimated number of 
status dropouts to the estimated number of all 14- to 
21-year-olds. The event dropout rate is defined as 
the ratio of the estimated number of event dropouts to 
the estimated number of 14- to 21-year-olds who 
were enrolled in school the previous October. This 
denominator is used because, in order to be an event 
dropout, the person had to be enrolled in school the 
previous October. 
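
The two dropout definitions and their rates can be sketched as follows. The record fields (enrolled_now, has_diploma, enrolled_last_october, weight) and the sample values are hypothetical, and the event-rate denominator here is simply everyone enrolled the previous October, without any further adjustment that a particular data file might require.

```python
def weighted_total(persons, keep):
    """Sum of weights over the persons for which keep(person) is true."""
    return sum(p["weight"] for p in persons if keep(p))


def is_status_dropout(p):
    # Not enrolled in October of the current year and no diploma or equivalent.
    return not p["enrolled_now"] and not p["has_diploma"]


def is_event_dropout(p):
    # A status dropout who was enrolled in October of the previous year.
    return is_status_dropout(p) and p["enrolled_last_october"]


def dropout_rates(persons):
    """Return (status dropout rate, event dropout rate) as proportions."""
    all_youth = weighted_total(persons, lambda p: True)
    in_school_last_year = weighted_total(persons, lambda p: p["enrolled_last_october"])
    status_rate = weighted_total(persons, is_status_dropout) / all_youth
    event_rate = weighted_total(persons, is_event_dropout) / in_school_last_year
    return status_rate, event_rate


# Hypothetical 14- to 21-year-olds with illustrative weights.
youths = [
    {"weight": 100.0, "enrolled_now": True, "has_diploma": False, "enrolled_last_october": True},
    {"weight": 100.0, "enrolled_now": False, "has_diploma": True, "enrolled_last_october": True},
    {"weight": 50.0, "enrolled_now": False, "has_diploma": False, "enrolled_last_october": True},
    {"weight": 50.0, "enrolled_now": False, "has_diploma": False, "enrolled_last_october": False},
]
status_rate, event_rate = dropout_rates(youths)
```

In this toy file every event dropout is also a status dropout, but the two rates use different denominators, so they are not directly comparable.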

The estimate of the denominator of the event 
dropout rate had to be constructed from other 
variables because the necessary data for a direct 
estimate were not available in the public release 
files.* The denominator, the number of 14- to 21- 
year-olds in school last year, is defined as the number 
of persons enrolled in school in the previous year 
who did not graduate from high school prior to the 
current year. For example, a person who was 
enrolled in 1987 but had graduated before 1988 was 
excluded from the denominator, assuming the person 
was enrolled in postsecondary education in 1987. 
The exclusion attempts to eliminate those enrolled in 
higher education in the previous years. 

Telephone Coverage for 14- to 21-Year-Olds

The CPS estimates of the number of 14- to
21-year-olds living in telephone and nontelephone
households in October 1988 are shown in table 1.
The table presents the estimates for several different
reporting categories depending upon the person's or
household's characteristics. Table 2 presents the
status and event dropout rates for the corresponding 
populations. Totals and rates are both given because 
telephone coverage may affect these statistics 
differently. 

The estimates in tables 1 and 2 are presented 
to describe the basic problems associated with 
restricting the NHES to households with telephones 
and as the foundations for other types of estimates, 
such as the telephone coverage rate. (For complete
reporting of dropout statistics, see the NCES report
by Frase.) As described in the previous section,
estimates in this report are subject to sampling errors,
and these sampling errors can be approximated using
the procedures described in appendix A.

These tables show very large differences in 
dropout rates between telephone and nontelephone 
households. Both the status and the event dropout 
rates in nontelephone households are over four times 
as large as in telephone households. Figure 1 
displays these relationships graphically, including 
approximate 95 percent confidence intervals. A large 
difference between the estimates of the characteristics 
of persons living in telephone and nontelephone 
households is one of the two conditions necessary for 
producing a significant bias in a survey restricted to 
telephone households. The other condition is having 
a substantial portion of the population excluded from 
the survey because of the absence of a telephone in 
the household. 
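
The two conditions can be read off a standard decomposition of the bias of a mean estimated from telephone households only. The notation is introduced here for illustration and is not taken from the report: W_nt is the proportion of the target population living in nontelephone households, and Ybar_t and Ybar_nt are the means of the characteristic in telephone and nontelephone households.

```latex
\begin{aligned}
\bar{Y} &= (1 - W_{nt})\,\bar{Y}_{t} + W_{nt}\,\bar{Y}_{nt} \\
\bar{Y}_{t} - \bar{Y} &= W_{nt}\,\bigl(\bar{Y}_{t} - \bar{Y}_{nt}\bigr)
\end{aligned}
```

The bias is the product of the excluded share and the difference between the two groups, so it is small whenever either factor is small and substantial only when both are large.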

The telephone coverage rate is the estimated 
percentage of 14- to 21-year-olds, status dropouts, or 
event dropouts who live in households with 
telephones. The estimated coverage rates for October 
1987 and October 1988 are given in table 3. The 
overall coverage rate for all 14- to 21-year-olds is
about 91 or 92 percent, which is very close to the 93 
percent quoted for the entire population. For the 
subgroup of students who were enrolled in school last 
year the coverage is even better, 94 percent. 

The coverage rates for persons classified as 
either status or event dropouts in both 1987 and 1988 
are much lower, with rates varying between about 70 
and 80 percent. The estimated telephone coverage 
rate for status dropouts is approximately 70 percent 
in both 1987 and 1988. The estimated event dropout 
coverage rate is 75 percent in 1988 and 81 percent in 
1987. 
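
As a sketch of how a coverage rate of this kind is computed, the function below takes weighted person records and returns the percentage of a group living in telephone households. The field names and the weights are hypothetical and were chosen only to make the arithmetic easy to follow; they are not CPS estimates.

```python
def coverage_rate(persons, in_group=lambda p: True):
    """Percentage of the weighted group that lives in telephone households."""
    group = [p for p in persons if in_group(p)]
    total = sum(p["weight"] for p in group)
    covered = sum(p["weight"] for p in group if p["has_phone"])
    return 100.0 * covered / total


# Hypothetical weighted records (illustrative values only).
sample = [
    {"weight": 900.0, "has_phone": True, "status_dropout": False},
    {"weight": 30.0, "has_phone": False, "status_dropout": False},
    {"weight": 49.0, "has_phone": True, "status_dropout": True},
    {"weight": 21.0, "has_phone": False, "status_dropout": True},
]

overall = coverage_rate(sample)
dropouts = coverage_rate(sample, in_group=lambda p: p["status_dropout"])
```

Passing a different in_group predicate gives the rate for any subgroup, such as event dropouts or those enrolled in school last year.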
Figure 1. Estimated dropout rates by telephone status
for 14- to 21-year-olds in October 1988 
Figure 2 displays the estimated 1988 coverage rates
by household income and tenure (whether the 
person's home is rented or owned). The graphs 
demonstrate that the coverage rates vary by the 
person's or household's characteristic as well as by 
enrollment status. [The telephone coverage for status 
dropouts (the subgroup with the lowest overall 
coverage) who live in homes that are owned (85.4 
percent) is greater than the telephone coverage rate 
for renters, irrespective of enrollment status (83.7 percent).]

The estimated telephone coverage rates 
shown in figure 2 (and given in table 3) for 14- to
21-year-olds are subject to sampling errors. A
summary description of the size of these sampling 
errors is useful for evaluating the reliability of the 
differences described. For all 14- to 21-year-olds 
and those enrolled in school last year, the 
approximate standard error of the estimated telephone 
coverage rate is typically less than 1 percent. For the 
telephone coverage estimates of status dropouts and 
event dropouts, the approximate standard errors of
the estimates are typically 2 percent and 5 percent, 
respectively. Additional detail on the reliability of 
survey estimates is provided in appendix A. 

These estimates indicate two important 
findings. First, there are large differences in the 
enrollment status characteristics between those in 
telephone and nontelephone households. Second, the 
telephone undercoverage rate is large and varies by 
the characteristics of the person or household as well 
as by enrollment status. 

The implication is that simple estimates of
the number of dropouts and the dropout rates based
on a telephone household sample are expected to be
significantly lower than the corresponding estimates
would be if nontelephone households were also
represented. The effectiveness of simple and
alternative estimation procedures in reducing these 
biases is discussed in the next two sections. 
Figure 2. Estimated telephone coverage rates by selected
characteristics for 14- to 21-year-olds in October 1988

[Bar chart. Vertical axis: percent with telephones (0 to 100). Groups: all, in school last year, status dropouts, event dropouts. Legend: tenure (owned, rented).]

Source: Special tabulations of the 1988 October and November CPS



Estimation Schemes for 14- to 21-Year-Olds

As noted earlier, there is some degree of 
noncoverage bias associated with all telephone 
surveys, not just telephone surveys trying to estimate 
enrollment status or dropout rates. In preparing 
weights for estimation, a typical procedure is to 
calculate base weights which reflect the probabilities
of selection for each individual respondent, and then 
to adjust the weights for the estimated undercoverage 
and other forms of nonresponse. 

The simplest adjustment for lack of 
telephone coverage is to multiply the estimation 
weights of the telephone households by a constant to 
bring the estimate up to the total for the entire 
population. Since the overall coverage rate for the 
population of 14- to 21-year-olds is 91.8 percent for
1988, the simple adjustment estimator is the base
estimation weight multiplied by 1.09, the inverse of
91.8 percent. Obviously, better estimation schemes
are available, but this basic method serves as a
reference.
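
The arithmetic of the simple adjustment is shown below. The coverage rate of 91.8 percent comes from the text; the base weights are hypothetical.

```python
# Overall 1988 telephone coverage rate for 14- to 21-year-olds (from the text).
coverage = 0.918

# The simple adjustment factor is the inverse of the coverage rate.
simple_factor = 1.0 / coverage

# Hypothetical base weights for three telephone-household respondents.
base_weights = [1200.0, 950.0, 1430.0]
adjusted_weights = [w * simple_factor for w in base_weights]
```

Every respondent receives the same factor, so this adjustment corrects the estimated population total but cannot change any estimated rate.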

Poststratification, forcing the estimates from 
the survey to match known population totals for 
subdomains from a presumably more accurate 
independent data source, is often a better method to 
make these adjustments. See Holt and Smith for a
discussion of the usefulness of poststratification. One 
of the rationales for using poststratification is that it 
may reduce undercoverage bias. If the persons in a 
poststratification cell are homogeneous with respect 
to the characteristics of interest, then the 
poststratification can reduce the bias in the estimates 
and sometimes even reduce the sampling variability 
of the estimates. 

The poststratification scheme that we 
investigated involved 96 adjustment cells defined by 
age: 14, 15, 16, 17, 18, 19, 20, and 21; crossed 
with race/ethnicity: Hispanic, non-Hispanic black, 
non-Hispanic non-black; crossed with highest grade 
attended by the head of household: grade 8 or less, 
grades 9, 10 or 11, grade 12, any postsecondary 
education. The average number of persons in 
telephone households in the adjustment cells is about 
110. For almost all cells the size exceeds 20 
persons, but in five of the 96 cells the cell size is less 
than 10 (the minimum being 6 persons in one cell). 
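
The cell structure can be enumerated directly, which also confirms the count of 96. The labels below paraphrase the categories in the text, and the helper for flagging sparse cells is a hypothetical convenience, not part of the original analysis.

```python
from itertools import product

AGES = tuple(range(14, 22))  # 14 through 21
RACE_ETHNICITY = ("Hispanic", "non-Hispanic black", "non-Hispanic non-black")
HEAD_EDUCATION = (
    "grade 8 or less",
    "grades 9, 10 or 11",
    "grade 12",
    "any postsecondary education",
)

# Each adjustment cell is one (age, race/ethnicity, head's education) triple.
cells = list(product(AGES, RACE_ETHNICITY, HEAD_EDUCATION))


def sparse_cells(counts, minimum=20):
    """Return the cells whose unweighted sample count is below the minimum.

    counts maps a cell tuple to its number of sampled persons; cells
    missing from the mapping are treated as empty.
    """
    return [c for c in cells if counts.get(c, 0) < minimum]
```

A check such as sparse_cells would identify the few cells that fall below the working threshold of 20 persons.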

The variables that were available for
poststratification were: age, sex, region, education
level, tenure, family income, family size, and family
type. Age and race/ethnicity are variables used in the
poststratification of the CPS to independent totals. 
Preliminary investigations were conducted and 
variables such as sex and region were determined to 
have little effect on reducing the bias. The variables
chosen and the number of poststratification cells used
were a compromise between the power of the variables
to reduce the bias in estimates and the increase in 
variance associated with using too many cells. The 
96 cells were chosen with the objective of having as 
many cells as possible while maintaining at least 20 
cases in almost all cells. 

Two sets of weighted cell totals were 
produced, the first using the entire analysis file 
(telephone and nontelephone households), the other 
using the telephone household data alone. 
Poststratification factors were computed cell by cell,
by dividing the full analysis file estimate for each cell
by the corresponding telephone household estimate. A final
poststratified estimation weight was then computed 
for each 14- to 21-year-old in a telephone household 
by multiplying the CPS telephone file weight by the 
poststratification factor corresponding to the 
appropriate cell. 
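
The factor computation can be sketched as follows, assuming each person record carries a cell label, a weight, and a telephone indicator. These field names and the toy data are hypothetical.

```python
from collections import defaultdict


def cell_totals(persons, telephone_only=False):
    """Weighted total per cell, optionally restricted to telephone households."""
    totals = defaultdict(float)
    for p in persons:
        if telephone_only and not p["has_phone"]:
            continue
        totals[p["cell"]] += p["weight"]
    return totals


def poststratify(persons):
    """Return poststratified telephone-household records and the cell factors.

    Factor for a cell = full-file total / telephone-household total.
    A cell with no telephone households gets no factor; in practice such
    a cell would have to be collapsed with a neighboring cell.
    """
    full = cell_totals(persons)
    tel = cell_totals(persons, telephone_only=True)
    factors = {c: full[c] / tel[c] for c in tel}
    adjusted = [
        {**p, "ps_weight": p["weight"] * factors[p["cell"]]}
        for p in persons
        if p["has_phone"]
    ]
    return adjusted, factors


# Hypothetical analysis file with two cells.
analysis_file = [
    {"cell": "a", "weight": 40.0, "has_phone": True},
    {"cell": "a", "weight": 40.0, "has_phone": True},
    {"cell": "a", "weight": 20.0, "has_phone": False},
    {"cell": "b", "weight": 90.0, "has_phone": True},
    {"cell": "b", "weight": 10.0, "has_phone": False},
]
adjusted, factors = poststratify(analysis_file)
```

After the adjustment the telephone-household weights in each cell sum to that cell's full-file total, which is the defining property of poststratification.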

The reduction in bias due to poststratification 
depends on the statistic under consideration and the 
population subgroup to which it applies. For 
estimates of the number of status dropouts or event 
dropouts, the reduction in bias will be substantial, 
often 60 percent or more for estimates of event 
dropouts and 45 percent or more for status dropouts, 
depending on the subgroup. For other statistics, such 
as dropout rates, the improvement is far less. 

Up to this point, all the analysis had been 
performed on the 1987 merged file, but it was clear 
that the simple adjustment and the proposed 
poststratification estimation schemes were not 
adequate for producing reasonably accurate estimates 
of the number of dropouts or the dropout rate for a 
given year. The accuracy of estimates of year-to- 
year changes in the dropout statistics, which are 
presumably not subject to as large a bias as the 
estimates of the current level, was not evaluated. 

New approaches to the estimation of the 
characteristics of dropouts were suggested by this 
preliminary analysis of the 1987 merged data. The 
new procedures were implemented using the 1988 
data, and are defined below. In order to adequately 
describe these new approaches, it is useful to present 
the estimation formulae for both the simple 
adjustment estimator and the poststratified estimator. 

The simple adjustment estimator can be 
written as 

$$\hat{y}_{s} = \frac{P}{T} \sum_{i} w_i y_i \qquad (3.1)$$

where 

P = the estimated number of all 14- to 21-year-olds 
in CPS; 

T = the estimated number of 14- to 21-year-olds 
in telephone households in CPS; 

$w_i$ = the base weight of person i (a 14- to 21-year-old 
in a telephone household); and 

$y_i$ = the characteristic of person i. 

In the application of this estimator to NHES, 
the denominator of the adjustment factor, T, would 
be replaced by the estimated number of persons in 
telephone households as estimated from NHES, 
denoted here by $\hat{T}$. The 
appropriate estimation equation is then 

$$\hat{y}_{s} = \frac{P}{\hat{T}} \sum_{i} w_i y_i \qquad (3.2)$$



The equation for the poststratified estimator 
is very similar, except the adjustment factor is 
created for each poststratification cell. The equation 
for use in the NHES is 

$$\hat{y}_{ps} = \sum_{c} \frac{P_c}{\hat{T}_c} \sum_{i} w_{ci} y_{ci} \qquad (3.3)$$



where the only new notation is the subscript c which 
denotes poststratification cell c. 
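
To make the difference between the two estimators concrete, the sketch below applies a single overall adjustment factor and then cell-level factors to the same small telephone-household sample. The function names, the tuple layout, and all numbers are hypothetical; the control totals stand in for the CPS figures and the telephone totals for the survey estimates.

```python
def simple_adjustment_estimate(sample, P, T_hat):
    """One overall factor: (P / T_hat) times the weighted sum of y."""
    return (P / T_hat) * sum(w * y for _, w, y in sample)

def poststratified_estimate(sample, P_c, T_hat_c):
    """Cell-level factors: each person's weight is scaled by P_c / T_hat_c for its cell."""
    total = 0.0
    for cell, w, y in sample:
        total += (P_c[cell] / T_hat_c[cell]) * w * y
    return total

# Hypothetical telephone-household sample: (cell, base weight, y), y = 1 for a dropout.
sample = [("A", 100.0, 1), ("A", 100.0, 0), ("B", 200.0, 1)]
P_c = {"A": 250.0, "B": 400.0}       # assumed control totals for all persons, by cell
T_hat_c = {"A": 200.0, "B": 200.0}   # assumed telephone-household totals, by cell
P, T_hat = sum(P_c.values()), sum(T_hat_c.values())

simple = simple_adjustment_estimate(sample, P, T_hat)
poststratified = poststratified_estimate(sample, P_c, T_hat_c)
```

Because cell B has much lower assumed coverage than cell A, the cell-level factors give its dropout more weight than the single overall factor does, so the two estimates differ.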

As noted above, the basic assumption 
underlying the poststratification adjustment is that 
within each poststratification cell, the covered and 
noncovered populations have the same mean value for 
the characteristic being estimated, or more broadly, 
the characteristic has the same distribution in the two 
groups. The analysis suggests that the means for 
these two populations are not equal and, therefore, 
the basic distributional assumption of poststratification 
does not apply for the poststratification scheme thus 
far investigated for the NHES. 



An alternative procedure that we studied is 
to use the CPS October supplement to develop 
differential adjustments within each of the poststrati- 
fication cells based upon other characteristics of the 
persons. This procedure is not a poststratification 
scheme with smaller adjustment cells because the 
estimates are not forced to equal the within-cell totals 
from the CPS. 

The use of a smaller cell poststratification 
scheme was not examined for several reasons. First, 
poststratification based upon cells with small sample 
sizes tends to inflate the sampling errors of the 
estimates rather than reduce them. This increase in 
variance can be substantial. Second, the independent 
poststratification totals are supposed to be known, or 
at least subject to much less variability than the 
survey estimates. When small poststratification cells 
are used, the cell totals derived from the CPS are 
subject to relatively large sampling errors. Finally, 
additional bias may be introduced into the estimates 
if the survey estimates of the characteristics used to 
form the smaller cells do not exactly correspond to 
the estimates from the CPS. 

The alternative approach adopted for this 
analysis was to compute the ratios of the number of 
all persons to persons in telephone households from 
the 1987 data for critical enrollment status categories 
within each of the 96 poststratification cells and then 
use these ratios to adjust the weights from the 1988 
data. By using the 1987 data to define the cells and 
prepare the adjustment factors applied to the 1988 
data, the possibility of overestimating the effective- 
ness of the procedures was avoided. 

Up to three adjustment categories were 
created for each poststratification cell consisting of 
whether the person was enrolled in school, was not 
enrolled and had a diploma, or was not enrolled and 
did not have a diploma. This variable is called 
"INSCHOOL." 

Because the ratios varied considerably and 
were based upon small sample sizes for many cells, 
a smoothing technique was used to reduce the 
consequences of the variability of the ratios. An 
empirical Bayes procedure suggested by Efron and 
Morris was used to smooth the ratios by single year 
of age and adjustment cell variables. 

The estimation equation for the within-cell 
poststratified adjustment estimator is the same as the 
ordinary poststratified estimator except that within 
each poststratification cell the base weight is adjusted 
by a ratio depending on the category of INSCHOOL. 
The equation is 

$$\hat{y}_{within} = \sum_{c} \sum_{j} \sum_{i} w_{cji}\, r_{cj}\, y_{cji} \qquad (3.4)$$

where $r_{cj}$ is the ratio adjustment and j denotes the 
INSCHOOL category. This estimator is referred to 
as the within-cell adjusted poststratified estimator. 
Slight modifications of this estimator could be 
considered, such as using a 3-year average of the 
adjustment factors from the CPS. It is also 
reasonable to assume that this type of estimator 
would be updated annually. 
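
The within-cell adjustment can be illustrated with a short sketch: ratios of all persons to telephone-household persons are computed from a prior year by cell and INSCHOOL category, then applied to current-year telephone weights. The tuple layouts, category labels, and numbers are hypothetical, and the empirical Bayes smoothing step is deliberately left out.

```python
from collections import defaultdict

def adjustment_ratios(prior_year):
    """r[(cell, j)] = weighted persons in all households / weighted persons in telephone households."""
    all_w, tel_w = defaultdict(float), defaultdict(float)
    for cell, j, weight, has_phone in prior_year:
        all_w[(cell, j)] += weight
        if has_phone:
            tel_w[(cell, j)] += weight
    # Categories with no telephone cases are left out; smoothing is omitted.
    return {k: all_w[k] / tel_w[k] for k in all_w if tel_w[k] > 0}

def within_cell_adjusted_estimate(current_tel, ratios):
    """Sum over c, j, i of w_cji * r_cj * y_cji, using telephone-household persons only."""
    return sum(w * ratios[(cell, j)] * y for cell, j, w, y in current_tel)

# Hypothetical prior-year records: (cell, INSCHOOL category, weight, has_phone).
prior_year = [
    ("A", "enrolled", 100.0, True), ("A", "enrolled", 25.0, False),
    ("A", "no_diploma", 40.0, True), ("A", "no_diploma", 20.0, False),
]
# Hypothetical current-year telephone records: (cell, INSCHOOL category, weight, y),
# where y = 1 marks a status dropout.
current_tel = [("A", "enrolled", 100.0, 0), ("A", "no_diploma", 40.0, 1)]

ratios = adjustment_ratios(prior_year)
estimate = within_cell_adjusted_estimate(current_tel, ratios)
```

Here the non-enrolled, no-diploma category receives a larger ratio than the enrolled category, reflecting its lower assumed telephone coverage.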

Another estimator was considered because of 
the concern that the within-poststratification cell 
adjustment ratios might still be so variable as to 
increase the variability of the estimates. A mean of 
the adjustment cell ratios was developed across all the 
poststratification cells within a single year of age and 
INSCHOOL value. The equation for the mean 
adjustment estimator is identical to the within-cell 
poststratified adjustment estimator, except the ratios 
are constant within groups of poststratification cells 
in a single age. The equation is 



$$\hat{y}_{mean} = \sum_{c} \sum_{j} \sum_{i} w_{cji}\, r'_{cj}\, y_{cji} \qquad (3.5)$$



where the prime on r (r') indicates that it is the mean 
across poststratification cells within a single year of 
age and value of INSCHOOL. This estimator is 
referred to as the mean adjusted poststratified 
estimator. 
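
A sketch of the averaging step follows. It assumes an unweighted mean of the cell ratios within each combination of age and INSCHOOL category; the text does not say whether the mean was weighted, so this is an assumption, as are the cell labels, the `age_of_cell` mapping, and the numbers.

```python
from collections import defaultdict
from statistics import mean

def mean_ratios(ratios, age_of_cell):
    """Replace each r_cj by the mean ratio over cells sharing the same age and INSCHOOL value."""
    groups = defaultdict(list)
    for (cell, j), r in ratios.items():
        groups[(age_of_cell[cell], j)].append(r)
    group_mean = {k: mean(v) for k, v in groups.items()}
    return {(cell, j): group_mean[(age_of_cell[cell], j)] for (cell, j) in ratios}

# Hypothetical within-cell ratios for one INSCHOOL category.
ratios = {
    ("A", "no_diploma"): 1.5,
    ("B", "no_diploma"): 1.25,
    ("C", "no_diploma"): 1.125,
}
age_of_cell = {"A": 16, "B": 16, "C": 17}  # cells A and B share age 16

smoothed = mean_ratios(ratios, age_of_cell)
```

Cells A and B end up with the same ratio, so the adjustment no longer varies across poststratification cells within an age, which is the extra smoothing this estimator provides.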

The mean adjusted and the within-cell 
adjusted poststratified estimates are not fully 
satisfactory in the sense that they use adjustment 
factors derived from historical (the previous year) 
data. An estimator based upon raking NHES data to 
marginal totals from the CPS is an alternative 
estimation scheme that could be considered. The 
raking estimator was not included in this study 
because these computations were completed before 
the Field Test had finished data collection. The 
raking estimator could not be evaluated until data 
collection was completed and a comparison of the 
characteristics estimated from the different surveys 
could be conducted. Based on the data collected in 
the Field Test, some general comments on the raking 
estimates with applications to the estimates of 
dropouts are possible. These comments are included 
in the section on recommendations. 



Comparison of Estimates 

Estimates of the characteristics of 14- to 21- 
year-olds were computed using each of the four 
estimation schemes described above. The estimates 
were then divided by the estimated total derived from 
the regular CPS estimate, which is the sum of the 
estimates of those living in telephone and 
nontelephone households. If the estimate is identical 
to the regular CPS estimate, the quotient should be 
unity. The quotients were multiplied by 100 in order 
to simplify the exposition. The ratio for each of the 
four estimators can be expressed as 



$$R_k = 100 \times \frac{\hat{y}_k}{\hat{y}_{cps}} \qquad (3.6)$$



where k indicates the estimator used (simple 
adjustment, poststratified, mean adjusted 
poststratified, and within-cell adjusted poststratified), 
and the CPS in the denominator denotes the sum of 
the CPS estimates for the persons in telephone and 
nontelephone households. 

The ratios for estimates of totals are given in 
table 4. The same process was followed to produce 
ratios for estimates of dropout rates and these are 
given in table 5. A ratio of 100 indicates that the 
estimate is exactly equal to the value of the estimate 
from the regular CPS. A value of less than 100 
indicates that there is a residual downward bias in the 
estimate. Conversely, a value of greater than 100 
indicates that the estimator has overcompensated for 
telephone coverage bias and there is a residual 
upward bias in the estimate. 
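
The ratio and its reading can be expressed in a few lines. The function names and the example values are hypothetical and are not taken from tables 4 or 5.

```python
def ratio_to_cps(estimate, cps_estimate):
    """Ratio of an alternative estimate to the regular CPS estimate, scaled to 100."""
    return 100.0 * estimate / cps_estimate

def residual_bias(ratio, tol=1e-9):
    """Direction of the remaining bias implied by the ratio."""
    if abs(ratio - 100.0) <= tol:
        return "none"
    return "downward" if ratio < 100.0 else "upward"

# Hypothetical values: an adjusted estimate of 487.5 against a CPS estimate of 650.
ratio = ratio_to_cps(487.5, 650.0)
direction = residual_bias(ratio)
```

A ratio of 75 in this toy case would mean the estimator recovers only three-quarters of the CPS figure, that is, a residual downward bias of 25 percent.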

The ratios for estimates of both totals and 
rates are provided because of the possibility that the 
estimation process might affect the two types of 
statistics differentially. In fact, estimates for totals 
might be worse under some estimation schemes 
which improve the estimates for rates. As shown in 
the tables, this situation does not arise in the 
proposed estimation schemes. 

Because there are so many values to examine 
in the tables, summary statistics and histograms of 
the key statistics have been produced to help in the 
analysis. Table 6 summarizes the values of the 
estimates of the ratios of totals appearing in table 4. 
In estimating totals for the number of status dropouts 
and the number of event dropouts, the ratios indicate 
that the simple adjustment estimator is very poor (a 
downward bias of over 20 percent). This finding 
agrees with the original analysis of the 1987 CPS 
data. 

The adjusted poststratified estimators 
perform much better for event and status dropout 
estimates than those that rely only on 
poststratification. The averages of the ratios for the 
mean adjusted estimator are slightly closer to 100 
than those for the within-cell adjusted estimator. The 
variabilities for the within-cell adjusted estimators (as 
measured by the standard deviation and range) do not 
exceed those of the mean adjusted estimators. It 
should be noted that, since the values of the estimates 
in table 4 are not independent, the standard deviation 
and ranges are used simply to give some idea of the 
spread in the values. 
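
Summary measures of this kind can be computed as in the sketch below. The list of ratios is invented for illustration and does not reproduce any column of table 4 or table 6.

```python
from statistics import mean, median, stdev

def summarize(ratios):
    """Mean, median, sample standard deviation, and range of a set of ratios."""
    return {
        "mean": mean(ratios),
        "median": median(ratios),
        "std_dev": stdev(ratios),
        "range": max(ratios) - min(ratios),
    }

# Hypothetical ratios for one estimator across five reporting categories.
ratios = [96.0, 98.0, 100.0, 102.0, 104.0]
summary = summarize(ratios)
```

As the text cautions, the ratios are not independent, so the standard deviation here is only a descriptive measure of spread, not a basis for inference.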

There are two reasons which might explain 
why the estimates of the status dropouts and status 
dropout rates might be improved by the alternative 
estimation procedures more than the estimates for the 
event dropouts and event dropout rates. First, there 
are more status dropouts than event dropouts and 
more persons in the denominator of the status dropout 
rate than in the denominator of the event dropout 
rate. Because of the increased size, the estimates of 
status dropouts are more stable, especially from year 
to year. 

Second, the rotation scheme used in the CPS 
means that about half of the sample is repeated in 
1987 and 1988. For the merged October and 
November files of 1987 and 1988 about one-third of 
the sample should be common to both years. Many 
of the persons who are status dropouts in 1987 will 
still be status dropouts in 1988. Since one-third of 
the sample overlaps between the two years, a fair 
correlation over time may improve the stability of the 
ratio adjustments for the status dropout more so than 
the event dropouts. 

The summary statistics for the ratios of the 
rates from table 5 are given in table 7. The simple 
adjustment and the poststratified estimators again 
perform very poorly. The mean adjusted and the 
within-cell adjusted estimators are close to 100. The 
mean adjusted estimator is marginally closer to 100 
for the event dropout rates, and the within-cell 
adjusted estimator is marginally closer to 100 for the 
status dropout rates. Once again there is no 
indication of an increase in the variability in the 
ratios for the within-cell estimator as compared with 
the mean adjusted estimator. 

Figure 3 is a series of histograms of the 
values of the ratios of the estimated dropout totals 
from table 4. Looking down the page from the 
histogram of the simple estimator to the adjusted 
poststratified estimators shows the movement of the 
estimates to being more centered about the value of 
100 and also more concentrated about the mode. 
Figure 4 shows the histograms for the values of the 
ratios of the estimated dropout rates from table 5. 
The conclusions from this figure are consistent with 
those from figure 3. 

Despite the good performance of the adjusted 
estimators, it should be noted that they consistently 
overestimate the number of dropouts and the dropout 
rates for an important component of the population, 
those persons in households with incomes above 
$20,000. This result suggests that different 
poststratification cells which incorporate income 
levels more directly should be considered for the 
NHES. 



Recommendations for Estimation of 14- to 
21-Year-Olds in NHES 

The improvements in the estimates for 
dropout statistics based upon the adjusted 
poststratified estimators are substantial. There is 
little evidence to suggest that either of the two 
adjusted poststratified estimators is better than the 
other, based on the analysis of the 1988 CPS data. 
The recommendation is to use the mean adjusted 
poststratified estimator because it incorporates an 
additional smoothing over the within-cell adjusted 
estimator. 

The analysis also suggests that 
poststratification variables that are more closely 
related to household income should be considered for 
the estimation phase of NHES. The use of tenure 
either in addition to or in place of some of the other 
poststratification variables may be useful in this 
respect. Since the number of households in the Field 
Test was only about one-third the size of the CPS, 
different poststratification cells are required for the 
Field Test. 

Figure 3. - Histograms of ratios of alternative estimates to CPS estimates for 
dropout totals 

Figure 4. - Histograms of ratios of alternative estimates to CPS estimates for 
dropout rates 

Because of the problems associated with 
using historical data in adjusting the estimates from 
a sample survey, an alternative estimation scheme 
using raking rather than poststratification may be 
preferred. By raking NHES estimates to CPS totals, 
we may accomplish the same gains as shown by the 
adjusted poststratification methods and also take 
advantage of the bias reducing potential associated 
with variables such as income. The raking estimator 
fits into already established sampling theory and this 
has obvious advantages. 
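
For readers unfamiliar with raking, the sketch below shows the basic idea (iterative proportional fitting) on a two-way table of weighted totals. It is a generic illustration, not the estimator evaluated for the NHES; the choice of dimensions (for example, enrollment status by income group) and all numbers are assumptions.

```python
def rake(table, row_targets, col_targets, iterations=100):
    """Alternately scale rows and columns of weighted totals to match two sets of margins."""
    raked = [row[:] for row in table]
    for _ in range(iterations):
        # Scale each row to its target total.
        for i, target in enumerate(row_targets):
            row_sum = sum(raked[i])
            if row_sum > 0:
                raked[i] = [v * target / row_sum for v in raked[i]]
        # Scale each column to its target total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in raked)
            if col_sum > 0:
                for row in raked:
                    row[j] *= target / col_sum
    return raked

# Hypothetical survey totals (rows: two enrollment categories; columns: two income groups).
survey = [[40.0, 30.0], [35.0, 50.0]]
row_targets = [80.0, 90.0]   # assumed control totals for the row variable
col_targets = [85.0, 85.0]   # assumed control totals for the column variable

raked = rake(survey, row_targets, col_targets)
```

Unlike poststratification to the full cross-classification, raking needs only the marginal control totals, which is why it can bring in a variable such as income without creating many small cells.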

Further research on the raking estimator can 
be undertaken based on estimates from the Field Test 
of the NHES. One important consideration for 
raking is that the estimates from the survey must be 
consistent with those from the independent data 
source (CPS, for example). The analysis of the Field 
Test data shows that the enrollment characteristics 
could be used in raking since the estimates from both 
sources are consistent. This finding suggests that the 
different data items used in the two surveys do not 
cause significant variability in these estimates. 
Therefore, raking NHES estimates to CPS estimates 
of enrollment status may be very beneficial. Of 
course, other factors including sample size and the 
stability of the CPS estimates must also be considered 
when raking is proposed. 

Another area of research that might be 
considered is the impact of the poststratification and 
bias adjustments on estimates of change over time. If 
the NHES is structured as a periodic survey on the 
same topics, the estimates of change may be as 
important or more important than estimates of current 
level. Additional research into the relationship 
between the estimation scheme and measures of 
change could be important in these circumstances. 

Undercoverage Bias in Estimates of 
3- to 5-Year-Olds 

The extent of the bias arising from the lack 
of telephone coverage for estimating education-related 
characteristics of 3- to 5-year-olds was not expected 
to be as large as the bias in estimating characteristics 
of 14- to 21-year-old dropouts. However, studies on 
telephone coverage in the United States have shown 
that the age group with the lowest telephone coverage 
is persons under 6 years old. These findings 
suggested that research into the biases associated with 
telephone coverage for 3- to 5-year-olds would be 
useful for NHES. 

The October CPS Education Supplement 
does not contain many data items on persons between 
3 and 5 years old. The elements in the supplement 
that are most pertinent to the education issues 
addressed by the NHES are enrollment in any type of 
school, enrollment in nursery school, and enrollment 
in kindergarten. The percentage of 3- to 5-year-olds 
enrolled in any type of school, nursery school, and 
kindergarten is computed by dividing the appropriate 
estimated total by the estimated number of 3- to 5- 
year-olds. 

Telephone Coverage for 3- to 5-Year-Olds 

The CPS estimates of the number of 3- to 5- 
year-olds living in telephone and nontelephone 
households are shown in table 8. The table presents 
the estimates for many of the same reporting 
categories used in the earlier analysis of 14- to 21- 
year-olds. Table 9 shows the percent of 3- to 5-year- 
olds who are enrolled, enrolled in nursery school, 
and enrolled in kindergarten by the same reporting 
categories for all households, telephone households, 
and nontelephone households. 

These tables show that the percentage 
enrolled, the percentage enrolled in nursery school, 
and the percentage enrolled in kindergarten do not 
vary considerably between those in telephone 
households and nontelephone households. The 
relatively consistent patterns for telephone and 
nontelephone households can be seen in figure 5 
which gives the percentage of 3- to 5-year-olds 
enrolled by telephone status, along with approximate 
95 percent confidence intervals. The only difference 
which is statistically significant is the percentage in 
nursery school. 

As noted earlier, one of the two conditions 
necessary for producing a large bias in a survey 
restricted to telephone households is a sizeable difference 
between the estimates of the characteristics of the 
persons in telephone households and nontelephone 
households. Based upon the enrollment estimates 
available from the CPS, this condition does not 
appear to be satisfied for the characteristics of 3- to 
5-year-olds studied. 

Figure 5. - Estimated percentage enrolled, by telephone 
status for 3- to 5-year-olds in October 1988 

The estimated telephone coverage rates for 
3- to 5-year-olds for 1987 and 1988 are shown in 
table 10. The overall estimated coverage rate for 3- 
to 5-year-olds is only about 88 percent, which is 
lower than the 93 percent telephone coverage rate for 
persons of all ages. When we examine the telephone 
coverage rates for enrolled 3- to 5-year-olds, it is 
clear that this subgroup of the population has slightly 
greater telephone coverage than the overall average. 
This situation is the converse of what was observed 
for the 14- to 21-year-olds. The 3- to 5-year-olds in 
nursery school have the highest telephone coverage 
rate, about 6 percent above the estimate for all 3- to 
5-year-olds. The telephone coverage rate for those in 
kindergarten is estimated to be only slightly greater 
than the overall rate. 

The high coverage rates for these subgroups 
indicate that the problems associated with bias will 
probably be relatively small for this population. It is 
possible that poststratified estimates might result in 
overestimates of the characteristics of interest. This 
possibility is remote because the coverage rates for 
the subgroups do not vary substantially from the 
coverage rates for all 3- to 5-year-olds. 



Figure 6 displays the estimated telephone 
coverage rates of 3- to 5-year-olds by tenure and 
household income. The graphs show that the 
telephone coverage rates vary considerably by these 
characteristics that are related to wealth. The graphs 
also show that the relationship between these 
characteristics and the percentage of households with 
telephones is relatively constant across the enrollment 
categories of the 3- to 5-year-olds. 

The estimated telephone coverage rates 
shown in figure 6 and table 10 are subject to 
sampling errors. For all 3- to 5-year-olds and those 
enrolled in school, the approximate standard error of 
the estimated telephone coverage rate is typically less 
than 1.5 percent. For the telephone coverage 
estimates of those in nursery school and those in 
kindergarten, the approximate standard error of the 
estimate is typically less than 2 percent. Details on 
these approximations are given in appendix A. 

Figure 6. - Estimated telephone coverage rates, by selected 
characteristics, for 3- to 5-year-olds in October 1988 

The results from the analysis of the 3- to 5- 
year-old data are very different from the results from 
the 14- to 21-year-old analysis. The percentage 
enrolled does not vary much between the persons in 
telephone households and those in nontelephone 
households. Furthermore, the estimated telephone 
coverage for the enrolled persons is slightly greater 
than the telephone coverage rate for all 3- to 5-year- 
olds. The estimated telephone coverage does vary 
considerably by the characteristic of the person or 
household, but does not vary much by enrollment 
status within the characteristic. These findings imply 
that a survey restricted to telephone households may 
not introduce large biases due to telephone 
undercoverage, even if relatively simple estimation 
procedures are used. The following sections examine 
the effects of various estimation schemes and the 
coverage bias remaining. 



Estimation Schemes for 3- to 5-Year-Olds 

The findings discussed above suggest that an 
ordinary poststratification estimator should be 
adequate to provide estimates of reasonable accuracy 
for the educational characteristics of 3- to 5-year-olds 
in NHES. For this reason the estimation schemes 
studied for this population were restricted to the 
simple adjustment estimator and the ordinary 
poststratification estimator. 

The simple adjustment estimator for the 3- to 
5-year-old estimates was created by multiplying the 
estimation weight for the telephone households by the 
inverse of the telephone coverage rate. The 
telephone coverage rate for the 3- to 5-year-olds is 
estimated as 0.884, so the simple adjustment 
estimator is the base telephone estimation weight 
multiplied by 1.13. The simple adjustment estimator 
for 3- to 5-year-olds can be written the same as 
equation 3.2. 

The poststratification scheme used for 
estimating the characteristics for the 14- to 21-year-olds 
was also used for the 3- to 5-year-olds to make 
the process simpler. The poststratification scheme 
for the 3- to 5-year-olds involved 36 adjustment cells 
defined by age: 3, 4, and 5; crossed with 
race/ethnicity: Hispanic, non-Hispanic black, and 
non-Hispanic non-black; crossed with highest grade 
attended by the head of household: grade 8 or less, 
grades 9, 10 or 11, grade 12, any postsecondary 
education. The average number of persons in 
telephone households in the adjustment cells is about 
110. For almost all cells the size exceeds 20 
persons, but in one of the 36 cells the cell size is less 
than 10 (the minimum being 9 persons in one cell). 
The poststratified estimator can be written in the 
same form as equation 3.3. 

The analysis of the 1987 merged file (not 
shown in this report) indicated that the bias associated 
with the lack of coverage for nontelephone 
households was not a severe problem, and both the 
simple adjustment estimator and the poststratified 
estimator were reasonably adequate for handling the 
problem. Therefore, no alternative estimation 
schemes were investigated for 3- to 5-year-olds for 
NHES. 



Comparison of Estimates 



The procedures used for comparing the 
different estimators for the 14- to 21-year-olds were 
also used for the two estimators for the 3- to 5-year- 
olds. The estimates for each of the two estimation 
schemes were multiplied by 100 and then divided by 
the estimated total derived from the regular CPS 
estimate, which is the sum of the estimates of those 
living in telephone and nontelephone households. If 
the estimate is identical to the regular CPS estimate, 
then the quotient should be 100. The ratio for each 
of the two estimators can be expressed as 



$$R_k = 100 \times \frac{\hat{y}_k}{\hat{y}_{cps}} \qquad (4.1)$$



where k indicates the estimator used (simple 
adjustment, or poststratified), and the CPS in the 
denominator denotes the sum of the CPS estimates 
for the persons in telephone and nontelephone 
households. 

The ratios for estimates of totals are given in table 
11. The same process was followed to produce ratios 
for estimates of the percent of 3- to 5-year-olds 
enrolled and these are given in table 12. A ratio of 
100 indicates that the estimate is exactly equal to the 
value of the estimate from the regular CPS. A value 
of less than 100 indicates that there is a residual 
downward bias in the estimate. Conversely, a value 
of greater than 100 indicates that the estimator has 
overcompensated for telephone coverage bias and 
there is a residual upward bias in the estimate. 

Summary statistics and histograms of the key 
statistics have been produced to help in the analysis. 
Table 13 summarizes the values of the estimates of 
the ratios of totals appearing in table 11. In 
estimating totals for the number of 3- to 5-year-olds 
enrolled, the ratios indicate that the simple adjustment 
estimator is reasonable. The mean and median of the 
ratios are relatively close to 100. This finding agrees 
with the original analysis of the 1987 CPS data. 

The poststratified estimator performs somewhat 
better than the simple estimator. The averages 
for the ratios of the poststratified estimator are 
slightly closer to 100 than those of the simple estimator. The 
variability for the poststratified estimator (as 
measured by the standard deviation and range) is also 
somewhat smaller than the variability of the simple 
adjustment estimator. 

The summary statistics for the ratios of the 
percentage enrolled from table 12 are given in table 
14. The simple adjustment and the poststratified 
estimators again perform well. The averages of the 
ratios for the poststratified estimator are closer to 100 
than the averages for the simple adjustment estimator. 
The variability of the poststratified estimator is not 
clearly smaller than that of the simple adjustment 
estimator for these statistics. 

Figure 7 is a series of histograms of the 
values of the ratios of the estimated enrollment totals 
from table 11. Looking down the page from the 
histogram of the simple estimator to the poststratified 
estimator shows the slight movement of the estimates 
to being more centered about the value of 100 and 
also more concentrated about the mode. The amount 
of bias reduction due to the poststratified estimator is 
much smaller for the 3- to 5-year-old estimates than 
it was for the 14- to 21-year-old estimates mainly 
because the simple adjustment estimator does so 
much better for the 3- to 5-year-old estimates. This 
relationship can be seen by comparing the histograms 
in figure 3 to those in figure 7. 

Figure 8 shows the histograms for the values 
of the ratios of the estimated percent enrolled from 
table 12. This figure indicates that both the simple 
and poststratified estimators are reasonable and differ 
only slightly. The poststratified estimator does 
slightly reduce the overestimation of the percentage 
enrolled which results from the use of the simple 
adjustment estimator, especially for the number of 
persons enrolled in households with incomes above 
$20,000. 



Recommendations for Estimation of 3- to 
5-Year-Olds in NHES 

The poststratified estimator appears to 
perform reasonably well for the range of statistics 
that were available from the CPS for the 3- to 5-year- 
old population. The poststratified estimator is 
recommended for use with this target population in 
the NHES. The problems associated with 
undercoverage bias due to households without 
telephones do not appear to be substantial for this 
target population. Because of the paucity of data in 
the CPS on the education and care of 3- to 5-year- 
olds, it would be useful to consider other data sources 
before finalizing this analysis. 

Figure 7. - Histograms of ratios of estimates to CPS estimates for enrollment 
totals 

Figure 8. - Histograms of ratios of estimates to CPS estimates for enrollment 

APPENDIX 



Source and Reliability of Estimates 



The estimates contained in this report are 
derived from samples and are subject to sampling and 
nonsampling errors. Sampling errors occur because 
the data are collected from a sample of the population 
rather than from the entire population. If the entire 
population were enumerated, there would not be any 
sampling error. The differences in the estimates due 
to the fact that only a sample has been observed are 
referred to as sampling errors. 

Nonsampling errors come from a variety of 
sources and affect all surveys, even surveys which 
enumerate the entire population. This report has 
concentrated on one type of nonsampling error for 
the NHES: the nonsampling error arising because 
households without telephones are eliminated from 
the survey. The CPS is also subject to nonsampling 
error arising from sources such as coverage, other 
types of design decision, data collection procedures, 
processing procedures, and reporting procedures. To 
the extent possible, procedures are built into surveys 
to minimize nonsampling errors. 

The standard error of an estimate is a 
measure of the sampling variability associated with 
that estimate. The standard error can be used to 
construct confidence intervals which are ranges that 
would include the average result of all possible 
samples with a known probability. In other words, 
if all possible samples were selected, then about 95 
percent of the intervals constructed by taking the 
sample average and adding or subtracting two times 
the sample standard error would include the average 
over all the possible samples. 
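
The interval construction described above amounts to the following. The example values are illustrative only: 88.4 percent is the coverage rate quoted in the text for 3- to 5-year-olds, and 1.5 is used as an assumed standard error at the upper end of the range the text mentions.

```python
def approx_95_interval(estimate, standard_error):
    """Estimate plus or minus two standard errors, as an approximate 95 percent interval."""
    half_width = 2.0 * standard_error
    return (estimate - half_width, estimate + half_width)

# Illustrative: coverage rate of 88.4 percent with an assumed standard error of 1.5.
interval = approx_95_interval(88.4, 1.5)
```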

The approximate standard error for an 
estimate from the CPS may be computed using the 
following formulas, as suggested by the Bureau of the 
Census ("School Enrollment-Social and Economic 
Characteristics of Students: October 1986," Current 
Population Reports, Series P-20, No. 429): 



Number of persons 

$$\mathrm{s.e.}(X) = \sqrt{bX\left(1 - \frac{X}{T}\right)}$$

Percentage of persons 

$$\mathrm{s.e.}(P) = \sqrt{\frac{b}{T}\,P(100 - P)}$$

where 

X = the estimated number of persons with the 
characteristic; 

T = the estimated total population in the 
category; 

P = the estimated percentage of persons with the 
characteristic; and 

b = 2,312 for total or white population 14 to 21 
years old 

2,600 for Black or Hispanic population 14 to 
21 years old 

2,698 for all populations 3 to 5 years old. 

The approximate standard error for a 
proportion, such as the telephone coverage rate, can 
be computed using the following formula: 



Ratio of persons 

where r = X/Y, the ratio of two estimates. 

These approximations for the standard errors 
of estimates were computed based upon the full 
household sample from the CPS. Since only about 
71 percent of the sample is used in the analysis of the 
undercoverage, the approximations may 
underestimate the standard errors of the estimates. 
One rough method to compensate for the reduced 
sample size is to increase the parameter b by a factor 
of 1.4. 
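
A sketch of these approximations follows. It assumes the usual CPS generalized-variance forms, s.e.(X) = sqrt(bX(1 - X/T)) for a number and s.e.(P) = sqrt((b/T) P (100 - P)) for a percentage, with the b parameters listed above and the optional 1.4 inflation. The population total in the example is hypothetical, chosen only to keep the arithmetic simple.

```python
import math

# b parameters as listed in this appendix.
B_PARAMETER = {
    "total_or_white_14_21": 2312,
    "black_or_hispanic_14_21": 2600,
    "all_3_5": 2698,
}

def se_number(x, T, b, inflate=1.0):
    """Approximate standard error of an estimated number of persons."""
    return math.sqrt(inflate * b * x * (1.0 - x / T))

def se_percentage(P, T, b, inflate=1.0):
    """Approximate standard error of an estimated percentage with base T."""
    return math.sqrt(inflate * b / T * P * (100.0 - P))

# Hypothetical base of 10,792,000 children aged 3 to 5, with 50 percent enrolled.
T = 10_792_000
b = B_PARAMETER["all_3_5"]
se_full_sample = se_percentage(50.0, T, b)
se_reduced_sample = se_percentage(50.0, T, b, inflate=1.4)
```

The two forms agree with each other: dividing the standard error of the number by T and multiplying by 100 gives the standard error of the corresponding percentage.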
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